
All Questions

1 vote, 3 answers, 1k views

Comparing whether two very large text contents are different or not efficiently

I have a MySQL database with a column Body MEDIUMTEXT. Until now I only stored content in it; the application's users had no way to update it. Now I want to add an update ...
SkrewEverything
0 votes, 1 answer, 117 views

I have a data set of over a million addresses and I want to display the closest N locations to a given address or current location

I am a student working on a personal project which is essentially a location finder that will be on a website. I have a data set of over a million addresses and I want to display the closest N ...
Jason Hong
7 votes, 3 answers, 2k views

System Design: Very Large CSV Imports Every Month

We have a webapp that will rely on large CSVs from external vendors every month. By large, I mean around 6 GB, so a few million rows. Probably 2-5 CSVs. This webapp will also allow ...
user2370642
6 votes, 3 answers, 2k views

Deduplication of complex records / Similarity Detection

I'm working on a project that involves records with fairly large numbers of fields (~15-20) and I'm trying to figure out a good way to implement deduplication. Essentially the records are people along ...
Tomdarkness
3 votes, 1 answer, 311 views

Processing every leaf under a node in a tree efficiently

Short version: In a tree (non-binary) with many levels of children, where each node can have multiple leaves, what is the best way to tally leaves that meet a certain condition given a node? Long, ...
SB2055
